Abstract

In 1968, Witsenhausen introduced his famous counterexample where he showed that even in the 
simple linear quadratic static team decision problem, complex nonlinear decisions could outperform any 
given linear decision. This problem has served as a benchmark problem for decades where researchers 
try to achieve the optimal solution. This paper introduces a systematic iterative source-channel coding
approach to solve problems of the same character as the Witsenhausen counterexample. The advantage of the
presented approach is its simplicity. Also, no assumptions are made about the shape of the space of
policies. The minimal cost obtained using the introduced method is 0.16692462, which is the lowest 
known to date. 

Index Terms 

Open-loop control systems, decision making, iterative methods, discretization, quantizer design, 
stochastic control. 



I. Introduction 

The most fundamental problem in control theory, namely the static output feedback problem,
has been open since the birth of control theory. The question is whether there is an efficient
algorithm that can decide existence of, and find, stabilizing controllers, linear or nonlinear, based
on imperfect measurements and given memory. The static output feedback problem is just an
instance of the problem of control with information structures imposed on the controllers, which
has been very challenging for decision theory researchers. In 1968, Witsenhausen [22] introduced
his famous counterexample:

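$$\inf_{\gamma_1(\cdot),\,\gamma_2(\cdot)} E\left[k^2 \gamma_1^2(X_0) + X_2^2\right] \tag{1}$$

where

$$X_1 = \gamma_1(X_0) + X_0, \tag{2}$$
$$X_2 = X_1 - \gamma_2(Y_2), \tag{3}$$
$$Y_1 = X_0, \tag{4}$$
$$Y_2 = X_1 + W, \tag{5}$$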

$X_0 \sim \mathcal{N}(0, \sigma^2)$, and $W \sim \mathcal{N}(0, 1)$. Here we have two decision makers, one corresponding to
$\gamma_1$ and the other to $\gamma_2$. The problem is a two-stage linear quadratic Gaussian control problem,
where the cost at the first time-step is $E[k^2 \gamma_1^2(X_0)]$ and the cost at the second is $E[X_2^2]$. At the first
time-step, the controller has full state measurement, $Y_1 = X_0$. At the second time-step, it has
imperfect state measurement, $Y_2 = X_1 + W$. What differs from the classical output feedback
problem is that the controller at the second stage does not have information from the past, since
it has no information about the output $Y_1$. Thus, the controller is restricted to be a static output
feedback controller. Witsenhausen showed that even in the simple linear quadratic Gaussian
control problem above, complex nonlinear decisions could outperform any given linear decision.
This problem has served as a benchmark problem for decades where researchers try to achieve
the optimal solution. It has been pointed out that the problem is complicated due to a so-called
"signaling incentive", where decisions are not only chosen to minimize a given cost, but also
to encode information in the decisions in order to signal information to other decision makers
in the team. In the example above, decision maker 2 measures $Y_2 = X_0 + \gamma_1(X_0) + W$, so its
measurement is affected by decision maker 1 through $\gamma_1$. Hence, decision maker 1 not only tries
to optimize the quadratic cost in (1), but also to signal information about $X_0$ to decision maker 2
through its decision, $\gamma_1(X_0)$.
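The signaling incentive can be made concrete with a small simulation. The following is a minimal sketch, not part of the original method: the function name `witsenhausen_cost` and the two example policies are illustrative assumptions, with $\sigma = 5$ and $k = 0.2$ taken from the benchmark setting. It estimates the cost in (1) by Monte-Carlo sampling of (2)-(5), once for a policy without first-stage control and once for a two-point signaling policy.

```python
import math
import random

def witsenhausen_cost(gamma1, gamma2, k=0.2, sigma=5.0, n=200_000, seed=0):
    """Monte-Carlo estimate of E[k^2 gamma1(X0)^2 + X2^2] under (2)-(5)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x0 = rng.gauss(0.0, sigma)   # X0 ~ N(0, sigma^2)
        w = rng.gauss(0.0, 1.0)      # W ~ N(0, 1)
        u1 = gamma1(x0)              # first decision, based on Y1 = X0
        x1 = x0 + u1                 # (2)
        y2 = x1 + w                  # (5)
        x2 = x1 - gamma2(y2)         # (3)
        total += k * k * u1 * u1 + x2 * x2
    return total / n

sigma = 5.0
# Policy A (assumed example): no first-stage control, linear MMSE estimate of X1.
cost_linear = witsenhausen_cost(
    lambda x0: 0.0,
    lambda y2: sigma ** 2 / (sigma ** 2 + 1.0) * y2)
# Policy B (assumed example): force X1 to +/- sigma and decode with tanh.
cost_signal = witsenhausen_cost(
    lambda x0: math.copysign(sigma, x0) - x0,
    lambda y2: sigma * math.tanh(sigma * y2))
print(cost_linear, cost_signal)
```

Under these assumptions the first estimate should land near $\sigma^2/(\sigma^2+1) \approx 0.96$ and the second near 0.40, which illustrates how spending first-stage cost to make $X_1$ easy to decode can pay off.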

Previous work has been pursued on understanding the Witsenhausen counterexample. Suboptimal
solutions were found in [?], and [13] studied variations of the problem when the signaling
incentive was eliminated. In [12], [14], connections to information theory were studied. An
extensive study of the information theoretic connection was made in [?], where it was shown
that coupling between decision makers in the cost function introduced the nonlinear behavior of
the optimal strategies. An ordinal optimization approach was introduced in [5] and a hierarchical
search approach was introduced in [16], where both rely on a given structure of the decisions.
The first method that showed that optimal strategies may have "slopes" to the quantizations was
given in [2]. Solutions with bounds are studied in [11]. A potential games approach in the paper
by [17] found the best known value to the date of its publication, namely 0.1670790.

Fig. 1: Schematic view of the system.

In this paper, we will introduce a generic method of iterative optimization based on ideas
from source-channel coding [?], [?], [?], [21], that could be used to solve problems of the
Witsenhausen counterexample character. The numerical solution we obtain for the benchmark
problem is of high accuracy and renders the lowest value known to date, 0.16692462. In
the following, $p(\cdot)$ and $p(\cdot \mid \cdot)$ denote probability density functions (pdfs) and conditional pdfs,
respectively.

II. Iterative Optimization 

We will now present an iterative design algorithm, based on person-by-person optimality, for
solving the minimization in equation (1). The method we propose is related to the Lloyd-Max
algorithm [10], [18], [19] that is successfully used when designing quantizers. A quantizer can
be described by its partition cells and their corresponding reproduction values. The partition cells
define to which codeword analog values are encoded and the reproduction values define how
the analog value is reproduced from the codeword. In general, there is no explicit, closed-form
solution to the problem of finding the optimal quantizer [10]. The key idea of the Lloyd-Max
algorithm is to assume that either the partition cells or the reproduction values are fixed; with one
part fixed, it is straightforward to derive an optimal expression for its counterpart. Next, one part
at a time is optimized in an iterative fashion. The Lloyd-Max algorithm has been generalized
and used in various joint source-channel coding applications. See for example [6], [7], [24],
where quantization for noisy channels is studied, [8], where bandwidth compression mappings
are designed, and [15], [21], where systems for distributed source coding and cooperative
transmission are optimized. The original Lloyd-Max algorithm converges to the global optimum
under certain conditions; however, when the system model gets more complicated, as in the joint
source-channel coding problems, there are no such guarantees. The algorithm can be shown to
converge, but the convergence point may be only locally optimal.
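The alternation described above can be sketched for a plain scalar quantizer. This is a minimal illustration on samples, not the generalized algorithm of this paper; the function name `lloyd_max`, the random initialization and the fixed iteration count are assumptions made for the example.

```python
import random

def lloyd_max(samples, levels, iters=50, seed=1):
    """Alternate nearest-neighbour partitioning and centroid updates."""
    codebook = sorted(random.Random(seed).sample(samples, levels))
    history = []
    for _ in range(iters):
        # Reproduction values fixed: the best cells are nearest-neighbour cells.
        cells = [[] for _ in codebook]
        for x in samples:
            j = min(range(levels), key=lambda i: (x - codebook[i]) ** 2)
            cells[j].append(x)
        # Cells fixed: the best reproduction value is the mean of each cell
        # (an empty cell keeps its previous value).
        codebook = [sum(c) / len(c) if c else codebook[i]
                    for i, c in enumerate(cells)]
        # Mean-squared error of the updated quantizer on the samples.
        mse = sum(min((x - c) ** 2 for c in codebook) for x in samples) / len(samples)
        history.append(mse)
    return codebook, history

rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(2000)]
codebook, history = lloyd_max(samples, levels=4)
print(codebook, history[-1])
```

Each of the two steps can only lower the sample distortion or leave it unchanged, so `history` is non-increasing; this is the same monotonicity argument that is used for the design algorithm later in this section.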

The above mentioned joint source-channel coding problems are all very similar in structure
to the Witsenhausen counterexample. We therefore propose to use a generalization of the Lloyd-Max
algorithm to this problem; the algorithm involves four key elements:

1) Formulation of necessary conditions on $\gamma_1$ and $\gamma_2$ such that they are individually optimal
given that $\gamma_2$ and $\gamma_1$, respectively, are fixed.

2) Discretization of the "channel" space between $\gamma_1$ and $\gamma_2$ such that $X_1$ and the input to $\gamma_2$
are restricted to belong to a finite set $S_L$.

3) Iterative optimization of $\gamma_1$ and $\gamma_2$ to make sure that they, one at a time, fulfill their
corresponding necessary conditions.

4) Use of a technique called parameter relaxation that makes the solution less sensitive to
the initialization.

A. Necessary Conditions on $\gamma_1$ and $\gamma_2$

Let us first define the function $\tilde\gamma_1(x_0) = \gamma_1(x_0) + x_0 = x_1$. Without loss of generality, we will
optimize with respect to $\tilde\gamma_1$. The cost we want to minimize is given by

$$J = E\left[k^2 \gamma_1^2(X_0) + (X_1 - \gamma_2(Y_2))^2\right]. \tag{6}$$

By using Bayes' rule and assuming that $\gamma_2$ is fixed, we can rewrite the optimization as

$$\inf_{\tilde\gamma_1} \iint p(y_2 \mid \tilde\gamma_1(x_0))\, F(x_0, \tilde\gamma_1(x_0), \gamma_2(y_2))\, dy_2\, p(x_0)\, dx_0$$
$$\overset{(a)}{=} \int \inf_{x_1} \int p(y_2 \mid x_1)\, F(x_0, x_1, \gamma_2(y_2))\, dy_2\, p(x_0)\, dx_0,$$

where

$$F(x_0, x_1, \gamma_2(y_2)) = k^2 (x_1 - x_0)^2 + (x_1 - \gamma_2(y_2))^2.$$

In (a) we make use of Theorem 14.60 in [20], which states that interchange of minimization
and integration is possible under certain conditions¹. Furthermore, since the optimal value of $J$
is not $-\infty$, the theorem states that $\tilde\gamma_1(\cdot)$ can be defined in a pointwise manner. Consequently, a
necessary condition for $\tilde\gamma_1$ to be optimal is given by

$$\tilde\gamma_1(x_0) = \arg\min_{x_1 \in \mathbb{R}} \left( \int p(y_2 \mid x_1)\, F(x_0, x_1, \gamma_2(y_2))\, dy_2 \right) \tag{7}$$

for almost every $x_0 \in \mathbb{R}$.

If we next assume that $\tilde\gamma_1$ is fixed, we see that the first term in (6) is a constant. The
minimization of $J$ with respect to $\gamma_2$ is therefore equivalent to

$$\inf_{\gamma_2(\cdot)} E\left[(X_1 - \gamma_2(Y_2))^2\right], \tag{8}$$

which is the mean-squared error (MSE). It is well known that the MSE is minimized by the
conditional expected value; hence,

$$\gamma_2(y_2) = E[X_1 \mid y_2] \tag{9}$$

for almost every $y_2 \in \mathbb{R}$, is a necessary condition for $\gamma_2(y_2)$ to be optimal.
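As a worked illustration of the conditional-mean condition (9), assume (purely for this example) that the first decision maker maps $X_0$ to one of two equiprobable points, $X_1 \in \{-\sigma, +\sigma\}$. With $W \sim \mathcal{N}(0,1)$ the conditional mean then has a closed form:

```latex
\begin{align*}
E[X_1 \mid y_2]
  &= \sigma\,\frac{p(y_2 \mid \sigma) - p(y_2 \mid -\sigma)}
                  {p(y_2 \mid \sigma) + p(y_2 \mid -\sigma)}, \\
p(y_2 \mid \pm\sigma)
  &= \frac{1}{\sqrt{2\pi}}\, e^{-(y_2 \mp \sigma)^2/2}
   = \frac{1}{\sqrt{2\pi}}\, e^{-(y_2^2 + \sigma^2)/2}\, e^{\pm\sigma y_2}, \\
E[X_1 \mid y_2]
  &= \sigma\,\frac{e^{\sigma y_2} - e^{-\sigma y_2}}{e^{\sigma y_2} + e^{-\sigma y_2}}
   = \sigma \tanh(\sigma y_2).
\end{align*}
```

The common factor $e^{-(y_2^2+\sigma^2)/2}/\sqrt{2\pi}$ cancels in the ratio, which is why only the exponentials $e^{\pm\sigma y_2}$ remain. For a general first-stage mapping no such closed form is available, which motivates the numerical treatment that follows.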

B. Discretization 

The expressions given in (7) and (9) are impractical to use in our design algorithm because
they require the functions to be specified for infinitely many input values. To get around this
problem we introduce a discrete set

$$S_L = \Delta \left\{ -\frac{L-1}{2},\, -\frac{L-3}{2},\, \ldots,\, \frac{L-3}{2},\, \frac{L-1}{2} \right\}, \tag{10}$$

where $L \in \mathbb{N}$ and $\Delta \in \mathbb{R}^+$ are two parameters that determine the number of points and the
spacing between the points, respectively. Next, we impose the constraint $x_1 \in S_L$, that is, the
output of $\tilde\gamma_1$ can only take one out of a finite number of values. In a similar way, the input to
$\gamma_2$ is discretized such that

$$\gamma_2(y_2) = \tilde\gamma_2(\tilde y_2), \qquad \tilde y_2 = Q_{S_L}(y_2) \in S_L, \tag{11}$$

where $Q_{S_L}(y_2)$ maps $y_2$ to the closest point in the set $S_L$. $\tilde\gamma_2$ can now be stored in the form of
a lookup table where each point in $S_L$ is associated with an output value. The approximation of
the real space with $S_L$ can be made more and more accurate by decreasing $\Delta$ and increasing $L$².
Finally, since $X_0$ is still continuous-valued, we use Monte-Carlo samples of $X_0$ to represent
the input to $\tilde\gamma_1$. $\tilde\gamma_1$ is now specified by evaluating

$$\tilde\gamma_1(x_0) = \arg\min_{x_1 \in S_L} \sum_{\tilde y_2 \in S_L} p(\tilde y_2 \mid x_1)\, F(x_0, x_1, \tilde\gamma_2(\tilde y_2)) \tag{12}$$

for each of the Monte-Carlo samples that represent $X_0$. In a similar way, $\tilde\gamma_2$ can be expressed
as

$$\tilde\gamma_2(\tilde y_2) = E[X_1 \mid \tilde y_2], \tag{13}$$

for all $\tilde y_2 \in S_L$, where the expectation with respect to $X_0$ is evaluated by using the Monte-Carlo
samples.

¹ In our case the conditions are fulfilled because the integrand is continuous in $x_0$ and $x_1 = \tilde\gamma_1(x_0)$ and the infimum is over
the space of all measurable functions.
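The discrete ingredients of this subsection can be sketched as follows. This is an illustrative sketch only: the helper names `make_grid`, `quantize` and `transition_matrix` are assumptions, and the outermost cells are extended to $\pm\infty$ so that every row of the transition matrix sums to one.

```python
import math

def make_grid(L, delta):
    """The set S_L of (10): L points with spacing delta, symmetric about zero."""
    return [delta * (i - (L - 1) / 2) for i in range(L)]

def quantize(y, L, delta):
    """Index of the point of S_L closest to y, i.e. the map Q_{S_L} in (11)."""
    i = round(y / delta + (L - 1) / 2)
    return min(max(i, 0), L - 1)

def gauss_cdf(z):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def transition_matrix(L, delta):
    """P[j][m] = Pr(Q(x1 + W) = s_m | x1 = s_j) for W ~ N(0, 1)."""
    grid = make_grid(L, delta)
    P = []
    for x1 in grid:
        row = []
        for m, s in enumerate(grid):
            # Cell of s_m is [s - delta/2, s + delta/2]; end cells are half-lines.
            lo = 0.0 if m == 0 else gauss_cdf(s - delta / 2 - x1)
            hi = 1.0 if m == L - 1 else gauss_cdf(s + delta / 2 - x1)
            row.append(hi - lo)
        P.append(row)
    return grid, P

grid, P = transition_matrix(5, 1.0)
print(grid)
print(quantize(0.3, 5, 1.0), round(P[2][2], 4))
```

With the matrix `P` precomputed, the sum in (12) becomes a table lookup per candidate $x_1$, and (13) becomes a weighted average over the Monte-Carlo samples.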



C. Design Algorithm Using Parameter Relaxation 

Given the above expressions for $\tilde\gamma_1$ and $\tilde\gamma_2$ it is possible to optimize the system iteratively.
One common problem with iterative techniques is that the final solution will depend on the
initialization of the algorithm. If the initialization is bad we are likely to end up in a poor
local minimum. One method that has proven to be helpful in counteracting this in joint
source-channel coding is noisy channel relaxation (NCR) [?], [?], [?], [21]. In this paper, we use
a generalization of NCR which we call parameter relaxation (PR). The idea of PR is to first
define a parameter space $\mathcal{P}$ that includes relevant system parameters such as noise variance,
power constraints, etc. Assuming that we have found a system that performs well for a system
parameter $\eta_n \in \mathcal{P}$, this system is then used as initialization when designing a new system for
a parameter $\eta_{n+1} = \eta_n + \epsilon \in \mathcal{P}$. This update procedure is continued until $\eta_n = \eta_T$, which is
the target system parameter (i.e., the system parameter for which we want to find the optimized
system).

² While decreasing $\Delta$, one has to increase $L$ to make sure that $\max(s \in S_L) = \Delta(L - 1)/2$ does not decrease.

The problem in the PR method is to determine a good starting point $\eta_0$ as well as the path
to reach $\eta_T$. In joint source-channel coding, the most common parameter to change is the noise
variance of the channel. The optimization starts with a high noise variance which is gradually
decreased to the target noise variance, hence the name NCR. In the Witsenhausen setup, we
have found that the parameter $k$ is useful to include in the parameter space: design a system for
a high value of $k$ first and then gradually decrease $k$ until the desired value $k_T$ is reached.
The reason to start with a high value of $k$ is that the design algorithm will find a solution
where $\tilde\gamma_1(x_0) \approx x_0$ in this case (i.e., $\gamma_1(x_0) \approx 0$) independently of $\gamma_2$. The design procedure
including the PR part is given in Algorithm 1. Each update on lines 7 and 8 in Algorithm 1 can only
decrease the cost or leave it unchanged. Since the cost is bounded from below, the algorithm converges.
It may happen that the algorithm converges to a local optimum; however, as will be seen in the
following section, the local optima we obtain are still better than any previously reported results.

Algorithm 1 Design Algorithm

Require: Initial mapping of $\gamma_2$, the value $k_T$ for which the system should be optimized, and the
         threshold $\delta$ that determines when to stop the iterations.
Ensure: Locally optimized $\gamma_1$ and $\gamma_2$.

 1: Let $k > k_T$.
 2: while $k > k_T$ do
 3:    Decrease $k$ according to some scheme (e.g., linearly).
 4:    Set the iteration index $i = 0$ and $J^{(0)} = \infty$.
 5:    repeat
 6:       Set $i = i + 1$.
 7:       Find the optimal $\gamma_1$ by using (12).
 8:       Find the optimal $\gamma_2$ by using (13).
 9:       Evaluate the cost function $J^{(i)}$ according to (6).
10:    until $(J^{(i-1)} - J^{(i)}) / J^{(i-1)} < \delta$
11: end while
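A compact sketch of Algorithm 1 is given below. It is a small-scale illustration under stated assumptions and not the implementation used for the reported results: the function name `design`, the small defaults (`L=41`, `n=200`), the iteration cap `max_iter` and the sample-average evaluation of (6) are all choices made for this example, and the symmetry restriction used in the experiments is omitted.

```python
import math
import random

def gauss_cdf(z):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def design(k_schedule, sigma=5.0, L=41, n=200, delta_stop=1e-4, max_iter=50, seed=0):
    rng = random.Random(seed)
    x0 = [rng.gauss(0.0, sigma) for _ in range(n)]       # Monte-Carlo samples of X0
    step = 10.0 * sigma / (L - 1)                         # grid spacing Delta
    grid = [step * (i - (L - 1) / 2) for i in range(L)]   # the set S_L
    # P[j][m] = Pr(quantized Y2 = grid[m] | X1 = grid[j]) with W ~ N(0, 1)
    P = [[(1.0 if m == L - 1 else gauss_cdf(grid[m] + step / 2 - s))
          - (0.0 if m == 0 else gauss_cdf(grid[m] - step / 2 - s))
          for m in range(L)] for s in grid]

    def stage2(g2):
        # Expected second-stage cost for every candidate output x1 = grid[j].
        return [sum(P[j][m] * (grid[j] - g2[m]) ** 2 for m in range(L))
                for j in range(L)]

    g2 = [0.0] * L           # initial gamma_2 lookup table (all zeros)
    idx = [0] * n            # grid index chosen by gamma_1 for each sample
    history = []             # (k, cost) after each inner iteration
    for k in k_schedule:     # parameter relaxation: walk k down to k_T
        prev = math.inf
        for _ in range(max_iter):
            D = stage2(g2)
            # (12): exhaustive search over S_L for every sample.
            idx = [min(range(L), key=lambda j: k * k * (grid[j] - x) ** 2 + D[j])
                   for x in x0]
            # (13): conditional mean of X1 for every quantized observation.
            for m in range(L):
                den = sum(P[j][m] for j in idx)
                if den > 0.0:
                    g2[m] = sum(P[j][m] * grid[j] for j in idx) / den
            # (6): sample-average cost of the updated pair.
            D = stage2(g2)
            cost = sum(k * k * (grid[j] - x) ** 2 + D[j]
                       for j, x in zip(idx, x0)) / n
            history.append((k, cost))
            if prev < math.inf and (prev - cost) / prev < delta_stop:
                break
            prev = cost
    return grid, idx, g2, history

grid, idx, g2, history = design([3, 2, 1.5, 1, 0.6, 0.4, 0.3, 0.2])
print("cost at k = 0.2:", history[-1][1])
```

Because both updates minimize the same sample-average objective, the recorded cost cannot increase between consecutive iterations for a fixed $k$. With such a coarse grid and so few samples the final value is only indicative and should not be compared with the costs reported in this paper.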
III. Results

A. Implementation Aspects 

For the evaluation of the design algorithm we have initially used $L = 201$ levels and chosen
$\Delta(L) = 10\sigma/(L - 1)$. We have used 400000 Monte-Carlo samples in the final optimizations
to represent $X_0$. Since it is known that the optimal $\gamma_1$ is symmetric about the origin [22], we
have restricted $\tilde\gamma_1$ to have this symmetry by generating only positive Monte-Carlo samples and
thereafter reflecting the resulting $\tilde\gamma_1$-function for negative values of $x_0$. To be able to compare
our results to previously reported results, we have set $\sigma = 5$ and $k_T = 0.2$. However, since we
are using the PR method, we have initially used the value $k = 3$ and decreased it according to
the series {3, 2, 1.5, 1, 0.6, 0.4, 0.3, 0.2}. Before running the design algorithm, we require $\gamma_2$ to
be initialized, but due to the PR this has little impact on the final solution and we have used the
initialization $\gamma_2 = 0$.

Once we have obtained the solution for $k_T = 0.2$, we have increased the precision by expanding
the number of points in the discrete set from $L$ to $L'$ and updated $\tilde\gamma_2$ according to

$$\tilde\gamma_2^{(L')}(\tilde y_2) = \tilde\gamma_2^{(L)}\left(Q_{S_L}(\tilde y_2)\right) \tag{14}$$

for all $\tilde y_2 \in S_{L'}$. Thereafter the inner part of the design algorithm, that is, lines 4-10, has been
run again to obtain a system optimized for the increased number of points $L'$. By repeating this
refinement, the precision increases and the cost decreases as will be shown later. This method
of refining the precision is similar to the one-way multigrid algorithm that is analyzed in [4].
The evaluations of (12) and (13) have been done using an exhaustive search; therefore, the run
time is exponential in the number of levels $L$.
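The refinement step (14) amounts to resampling the lookup table on the finer grid. The sketch below is an illustration only; the names `make_grid` and `refine_lookup` are assumptions, and the grid spacing follows $\Delta(L) = 10\sigma/(L-1)$ so that the range of the grid stays fixed while $L$ grows.

```python
def make_grid(L, sigma):
    """S_L with spacing Delta(L) = 10*sigma/(L-1), covering [-5*sigma, 5*sigma]."""
    delta = 10.0 * sigma / (L - 1)
    return [delta * (i - (L - 1) / 2) for i in range(L)]

def refine_lookup(g2_old, L_new, sigma):
    """(14): each new grid point takes the old table value at the nearest old point."""
    L_old = len(g2_old)
    delta_old = 10.0 * sigma / (L_old - 1)
    g2_new = []
    for y in make_grid(L_new, sigma):
        # Nearest old grid index; exact midpoints are resolved by Python's
        # round(), which rounds halves to the even index.
        i = round(y / delta_old + (L_old - 1) / 2)
        g2_new.append(g2_old[min(max(i, 0), L_old - 1)])
    return g2_new

g2_coarse = [-2.0, -1.0, 0.0, 1.0, 2.0]          # table on S_5 (sigma = 1)
g2_fine = refine_lookup(g2_coarse, 9, sigma=1.0)
print(g2_fine)
```

When $L' - 1$ is a multiple of $L - 1$, as in the sequence 201, 401, 801, ..., every old grid point is also a new grid point, so the refined table agrees with the old one there and the subsequent iterations only need to adjust the newly inserted points.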

B. Numerical Results 

During the first steps of the PR, $k$ is high. This means that the output of $\tilde\gamma_1$ should follow
the input closely to avoid large costs in the first stage. If continuous outputs were allowed, the
output would be identical to the input. However, since we are working with a discretized system,
only outputs from the set $S_L$ are feasible. As $k$ reaches 0.4-0.6 the step behavior of the output
appears. This range of $k$, where the system changes from being affine to having a more
general shape, is consistent with a result in [23], which states that the optimal cost is less than the
optimal cost for an affine system if $k < 0.564$. Depending on the realization of the Monte-Carlo
samples we get either a 3.5-step mapping or a 4-step mapping as shown in Fig. 2 (occasionally,
a 3-step solution has occurred). The total costs for these solutions are stated in Table 1(a). For ease
of comparison, we have also included the costs of previously reported results. As can be seen,
all our mappings have similar performance and all of them give lower costs than the previously
reported lowest cost, 0.1670790 [17].

In Table 1(b) we show how the cost decreases as the number of points $L$ is increased. The
method we use to calculate the total cost as well as some notes on the accuracy can be found in
the Appendix. The lowest cost we have achieved with our algorithm is 0.16692462. The mapping
that achieves this cost is the 4-step mapping shown in Fig. 2 with $L = 12801$ points. Although
the mapping contains four clear output levels it should be emphasized that each level is slightly
sloped; this can be seen in Fig. 3 where the first step has been zoomed in. It is reasonable to
assume that as the precision (i.e., $L$) increases further, each step of the mapping will converge
to a straight line that is slightly sloped.

Table 1

(a) Final cost for different solutions.

Solution                  Stage 1       Stage 2       Total Cost
Witsenhausen              0.40423088    0.00002232    0.40425320
Bansal & Basar [3]¹       0.36338023    0.00163460    0.36501483
Deng & Ho [5]¹            0.13948840    0.05307290    0.19256130
Baglietto et al. [2]      -             -             0.1701
Lee et al. [16]           0.13188408    0.03542912    0.16731321
Li et al. [17]            -             -             0.1670790
This paper, 3-step*       0.13493778    0.03201113    0.16694891
This paper, 3.5-step*     0.13462186    0.03230369    0.16692555
This paper, 4-step*       0.13484828    0.03207634    0.16692462

(b) Costs for different precisions for the 4-step solution.

L        M      Stage 1     Stage 2     Total Cost
201      16     0.121042    0.057641    0.17868301
401      22     0.130150    0.038834    0.16898421
801      30     0.135308    0.032009    0.16731642
1601     56     0.134966    0.032062    0.16702853
3201     110    0.134868    0.032081    0.16694954
6401     210    0.134859    0.032071    0.16692966
12801    396    0.134848    0.032076    0.16692462

¹ Costs obtained from [16]. * $L = 12801$.

Fig. 2: 4-step solution ($L = 12801$)

IV. Comparison to Previous Results 

In this section, we will compare the presented method with previous methods and note some 
differences: 

• No structure is assumed for the decision functions. In [?] and [16], monotonicity of the
decisions was assumed. The space of decisions is assumed to be a normed linear space in
[?].

• The design is fully automated and little modeling needs to be done a priori. In contrast,
significant analytic/modeling work was performed before posing the optimization problem
to be solved in [?], [16], and [?]. The first two require manual adjustments for the proper
choice of interval values and signal levels, and the third requires some prior analysis to
determine a constant "c". In [17], modeling work is needed in converting the problem into
a potential game.

V. Conclusions 

In this paper, we introduced a generic method of iterative optimization based on ideas from 
source-channel coding, that could be used to solve problems of the Witsenhausen counterexample 
character. The numerical solution we obtain for the benchmark problem is of high accuracy and 
renders the lowest value known to date, 0.16692462. Also, the design algorithm does not make 
any assumption on the structure of the policies — the solutions are allowed to have arbitrary 
shapes (within the restrictions imposed by the discretization). The results can therefore be seen 
as a confirmation that the step-shaped behavior is beneficial. 

Appendix 

In the design algorithm, $\tilde\gamma_1$ is specified implicitly by storing the output symbol to which each
Monte-Carlo sample is mapped. This representation is used when evaluating the cost during
the iterations in the design algorithm. However, to evaluate the final total cost we need higher
numerical accuracy. Therefore, the first step in calculating the total cost is to use the sample-based
representation to find thresholds, $A_i$, such that $\tilde\gamma_1$ can be given in the form

$$\tilde\gamma_1(x_0) = a_i \in S_L \quad \text{if } A_i < x_0 < A_{i+1}, \tag{15}$$

for $i \in \{0, \ldots, M - 1\}$, with $A_0 = -\infty$ and $A_M = \infty$. That is, the sample-based representation
of $\tilde\gamma_1$, which is explicitly defined only for the Monte-Carlo samples, is transformed to a function
which is defined for all real numbers. This representation makes it possible to numerically
evaluate the integrals that are needed to find the total cost

$$J = E\left[k^2 \gamma_1^2(X_0) + (X_1 - \gamma_2(Y_2))^2\right]
   = \underbrace{E\left[k^2 (\tilde\gamma_1(X_0) - X_0)^2\right]}_{=J_1}
   + \underbrace{E\left[(\tilde\gamma_1(X_0) - \tilde\gamma_2(\tilde Y_2))^2\right]}_{=J_2} \tag{16}$$

where

$$J_1 = \int_{x_0} p(x_0)\, k^2 (\tilde\gamma_1(x_0) - x_0)^2\, dx_0
     = k^2 \sum_{i=0}^{M-1} \int_{A_i}^{A_{i+1}} p(x_0) (a_i - x_0)^2\, dx_0, \tag{17}$$

$$J_2 = \int_{x_0} \sum_{\tilde y_2 \in S_L} p(x_0, \tilde y_2) (\tilde\gamma_1(x_0) - \tilde\gamma_2(\tilde y_2))^2\, dx_0
     = \sum_{i=0}^{M-1} \left[ \sum_{\tilde y_2 \in S_L} p(\tilde y_2 \mid a_i) (a_i - \tilde\gamma_2(\tilde y_2))^2 \right] \int_{A_i}^{A_{i+1}} p(x_0)\, dx_0, \tag{18}$$

and

$$p(\tilde y_2 \mid a_i) =
\begin{cases}
\int_{\tilde y_2 - \Delta/2}^{\infty} p(w = y_2 - a_i)\, dy_2 & \text{if } \tilde y_2 = \Delta \frac{L-1}{2}, \\
\int_{-\infty}^{\tilde y_2 + \Delta/2} p(w = y_2 - a_i)\, dy_2 & \text{if } \tilde y_2 = -\Delta \frac{L-1}{2}, \\
\int_{\tilde y_2 - \Delta/2}^{\tilde y_2 + \Delta/2} p(w = y_2 - a_i)\, dy_2 & \text{otherwise.}
\end{cases} \tag{19}$$

Fig. 3: Detailed view of the first step in the 4-step solution.

All integrals have been calculated numerically using the Matlab function quadl with the tolerance
specified to be $t = 10^{-18}$, which means that the absolute error of the result from quadl is not
greater than $t$. All integrands are continuous and have a smooth behavior that should cause no
problem for quadl. To upper bound the total cost, we have upper bounded each integral by
adding $t$ to each individual result from quadl and reevaluated the total cost. In this way we
have estimated the absolute error to be in the order of (or less than) $10^{-11}$. Matlab code for our
calculations of the total cost, including our decision functions, can be found in [?].
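The threshold-based evaluation of (16)-(19) can be sketched without numerical quadrature, because all the integrals involve Gaussian densities and have closed forms. The following is a hedged illustration that swaps quadl for these closed forms; the function name `total_cost` is an assumption, and the decoder is taken to be the conditional mean (13) computed exactly from the thresholds rather than a stored lookup table.

```python
import math

def phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def zphi(z):
    """z * phi(z), with the limit 0 at +/- infinity."""
    return 0.0 if math.isinf(z) else z * phi(z)

def total_cost(A, a, k, sigma, L):
    """J1 and J2 of (17)-(18) for thresholds A[0..M] and levels a[0..M-1]."""
    M = len(a)
    delta = 10.0 * sigma / (L - 1)
    grid = [delta * (i - (L - 1) / 2) for i in range(L)]
    J1, prob = 0.0, []
    for i in range(M):
        lo, hi = A[i] / sigma, A[i + 1] / sigma
        P0 = Phi(hi) - Phi(lo)                         # Pr(A_i < X0 < A_{i+1})
        M1 = sigma * (phi(lo) - phi(hi))               # integral of x0 p(x0)
        M2 = sigma ** 2 * (P0 + zphi(lo) - zphi(hi))   # integral of x0^2 p(x0)
        J1 += k * k * (a[i] ** 2 * P0 - 2.0 * a[i] * M1 + M2)
        prob.append(P0)

    def p_cell(m, ai):
        # (19): probability that the quantized Y2 equals grid[m] given X1 = ai.
        lo = 0.0 if m == 0 else Phi(grid[m] - delta / 2 - ai)
        hi = 1.0 if m == L - 1 else Phi(grid[m] + delta / 2 - ai)
        return hi - lo

    J2 = 0.0
    for m in range(L):
        w = [prob[i] * p_cell(m, a[i]) for i in range(M)]     # joint probabilities
        den = sum(w)
        if den == 0.0:
            continue
        g2 = sum(wi * ai for wi, ai in zip(w, a)) / den       # E[X1 | grid[m]]
        J2 += sum(wi * (ai - g2) ** 2 for wi, ai in zip(w, a))
    return J1, J2

# Assumed test case: a two-level mapping x1 = 5*sign(x0) with sigma = 5, k = 0.2.
J1, J2 = total_cost([-math.inf, 0.0, math.inf], [-5.0, 5.0], k=0.2, sigma=5.0, L=201)
print(J1, J2, J1 + J2)
```

For this two-level test case the first-stage cost has the closed form $2k^2\sigma^2(1 - \sqrt{2/\pi})$, which gives a quick sanity check of the partial-moment formulas before applying them to mappings with many levels.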



References

[2] M. Baglietto, T. Parisini, and R. Zoppoli. Numerical solutions to the Witsenhausen counterexample by approximating
    networks. IEEE Trans. on Automatic Control, 46(9):1471-1477, September 2001.
[3] R. Bansal and T. Basar. Stochastic teams with nonclassical information revisited: When is an affine law optimal? IEEE
    Trans. on Automatic Control, 32(6):554-559, June 1987.
[4] C. S. Chow and J. N. Tsitsiklis. An optimal one-way multigrid algorithm for discrete-time stochastic control. IEEE Trans.
    on Automatic Control, 36(8):898-914, 1991.
[5] M. Deng and Y. C. Ho. An ordinal optimization approach to optimal control problems. Automatica, 35:331-338, 1999.
[6] N. Farvardin and V. Vaishampayan. Optimal quantizer design for noisy channels: An approach to combined source-channel
    coding. IEEE Trans. on Information Theory, 33(6):827-838, November 1987.
[7] N. Farvardin and V. Vaishampayan. On the performance and complexity of channel-optimized vector quantizers. IEEE
    Trans. on Information Theory, 37(1):155-160, January 1991.
[8] A. Fuldseth and T. A. Ramstad. Bandwidth compression for continuous amplitude channels based on vector approximation
    to a continuous subset of the source signal space. In International Conference on Acoustics, Speech and Signal Processing
    (ICASSP), pages 3093-3096, Munich, Germany, April 1997.
[9] S. Gadkari and K. Rose. Noisy channel relaxation for VQ design. In International Conference on Acoustics, Speech and
    Signal Processing (ICASSP), pages 2048-2051, May 1996.
[10] A. Gersho and R. M. Gray. Vector Quantization and Signal Compression. Kluwer Academic Publishers, Dordrecht, The
    Netherlands, 1992.
[11] P. Grover, S. Y. Park, and A. Sahai. The finite-dimensional Witsenhausen counterexample. In ConCom, Seoul, Korea,
    March 2009.
[12] Y. C. Ho and T. S. Chang. Another look at the nonclassical information structure problem. IEEE Trans. on Automatic
    Control, 25(3), 1980.
[13] Y. C. Ho and K. C. Chu. Team decision theory and information structures in optimal control problems - part I. IEEE Trans.
    on Automatic Control, 17(1), 1972.
[14] Y. C. Ho, M. P. Kastner, and E. Wong. Teams, signaling, and information theory. IEEE Trans. on Automatic Control, 23(2),
    1978.
[15] J. Karlsson and M. Skoglund. Optimized low-delay source-channel-relay mappings. IEEE Trans. on Communications,
    58(5):1397-1404, May 2010.
[16] J. T. Lee, E. Lau, and Y. C. Ho. The Witsenhausen counterexample: A hierarchical search approach for nonconvex
    optimization problems. IEEE Trans. on Automatic Control, 46(3):382-397, March 2001.
[17] N. Li, J. R. Marden, and J. S. Shamma. Learning approaches to the Witsenhausen counterexample from a view of potential
    games. In IEEE Conference on Decision and Control, pages 157-162, December 2009.
[18] S. P. Lloyd. Least squares quantization in PCM. IEEE Trans. on Information Theory, 28(2):129-137, March 1982.
[19] J. Max. Quantizing for minimum distortion. IRE Trans. on Information Theory, 6:7-12, March 1960.
[20] R. T. Rockafellar and R. J. B. Wets. Variational Analysis, volume 317. Springer, 2009. Third printing.
[21] N. Wernersson, J. Karlsson, and M. Skoglund. Distributed quantization over noisy channels. IEEE Trans. on
    Communications, 57(6):1693-1700, June 2009.
[22] H. S. Witsenhausen. A counterexample in stochastic optimum control. SIAM Journal on Control, 6(1):131-147, 1968.
[23] Y. Wu and S. Verdú. Witsenhausen's counterexample: a view from optimal transport theory. In IEEE Conference on
    Decision and Control, to appear, 2011.
[24] K. A. Zeger and A. Gersho. Vector quantizer design for memoryless noisy channels. In IEEE International Conference
    on Communications, pages 1593-1597, Philadelphia, USA, June 1988.

March 4, 2013 DRAFT






